VAST Mini Challenge 2 - Answer

Jovinka Hartanto
7/13/2021

Question 1

Using just the credit and loyalty card data, identify the most popular locations, and when they are popular. What anomalies do you see? What corrections would you recommend to correct these anomalies? Please limit your answer to 8 images and 300 words.

Answer

Based on the loyalty and credit card data, the three most popular locations over the two weeks of data are Katerina’s Cafe, followed by Hippokampos and Guy’s Gyros.

As shown in the data table below, the loyalty and credit card data give a consistent ranking of the popular locations.
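The ranking described above comes down to counting transactions per location, separately for each card type. A minimal sketch of that count follows, assuming each record has been reduced to a (location, card type) pair; the sample rows and the helper name `top_locations` are illustrative and not part of the original analysis.

```python
from collections import Counter

# Hypothetical sample rows: (location, card type). The real analysis would
# load the challenge's credit card and loyalty card files instead.
transactions = [
    ("Katerina's Cafe", "credit"),
    ("Katerina's Cafe", "loyalty"),
    ("Katerina's Cafe", "credit"),
    ("Hippokampos", "credit"),
    ("Hippokampos", "loyalty"),
    ("Guy's Gyros", "credit"),
]

def top_locations(rows, card_type, n=3):
    """Return the n most frequent locations for one card type."""
    counts = Counter(loc for loc, kind in rows if kind == card_type)
    return counts.most_common(n)

credit_top = top_locations(transactions, "credit")
loyalty_top = top_locations(transactions, "loyalty")
print(credit_top)
print(loyalty_top)
```

Comparing the two lists side by side is one way to check whether both sources agree on the ranking.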

The stacked bar chart below shows the dates on which these three locations were most popular according to the loyalty and credit card data. The two sources give different results. Based on the loyalty card data, Katerina’s Cafe was most popular on 11 January 2014, with 19 transactions from GASTech employees; Hippokampos had the most transactions on 8 January 2014, and Guy’s Gyros on 15 January 2014. Based on the credit card data, however, Katerina’s Cafe was most popular on 6 January 2014, Hippokampos on 16 January 2014, and Guy’s Gyros on 13 January 2014.

The interactive bar graph below was created from the credit card data to show how visits to the top three locations are distributed across the hours of the day. Most GASTech employees visited Katerina’s Cafe and Guy’s Gyros at dinner time, whereas Hippokampos was more popular at lunch time, except on weekends.
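The lunch and dinner pattern above depends on grouping credit card timestamps by hour of day. The sketch below shows one way to do that grouping; the hour boundaries for "lunch" and "dinner" are assumptions chosen for illustration, and the timestamps are made-up sample values.

```python
from collections import Counter
from datetime import datetime

def meal_period(ts):
    """Label a timestamp with a coarse meal period (assumed boundaries)."""
    if 11 <= ts.hour < 15:
        return "lunch"
    if 17 <= ts.hour < 22:
        return "dinner"
    return "other"

# Hypothetical credit card timestamps for a single location.
stamps = [
    datetime(2014, 1, 6, 12, 30),
    datetime(2014, 1, 6, 19, 5),
    datetime(2014, 1, 7, 20, 15),
    datetime(2014, 1, 7, 7, 45),
]

by_period = Counter(meal_period(t) for t in stamps)
print(by_period)
```

Counting per location and per weekday in the same way would expose the weekend exception noted for Hippokampos.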

Anomalies

  1. Mismatch between credit card and loyalty card transactions

As seen above, the daily transaction counts differ between the credit card data and the loyalty card data. Further inspection shows a total of 409 unmatched records, which might lead to new clues. The unmatched records are listed in the table below.

  2. Wrong times in the credit card transaction data for three coffee shops: Bean There Done That, Brewed Awakenings, and Jack’s Magical Beans

To correct this anomaly, we can add the vehicle data to the analysis; the employees’ locations may let us determine when some of these transactions took place.
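The mismatch described under the first anomaly can be found by comparing the two transaction sets and keeping the records that appear in only one of them. The sketch below assumes a record is identified by (date, location, price), which is an assumption about the matching key rather than the method actually used for the 409 records; the sample rows are invented.

```python
from collections import Counter
from datetime import date

def unmatched(cc_rows, loyalty_rows):
    """Return (credit-only, loyalty-only) records via multiset difference."""
    cc = Counter(cc_rows)
    loy = Counter(loyalty_rows)
    return list((cc - loy).elements()), list((loy - cc).elements())

# Hypothetical records keyed by (date, location, price).
cc_rows = [
    (date(2014, 1, 6), "Hippokampos", 12.5),
    (date(2014, 1, 6), "Guy's Gyros", 8.0),
    (date(2014, 1, 7), "Katerina's Cafe", 20.0),
]
loyalty_rows = [
    (date(2014, 1, 6), "Hippokampos", 12.5),
    (date(2014, 1, 7), "Katerina's Cafe", 15.0),
]

cc_only, loyalty_only = unmatched(cc_rows, loyalty_rows)
print(len(cc_only), len(loyalty_only))
```

Using a multiset rather than a plain set keeps repeated purchases of the same amount at the same place on the same day from cancelling each other out.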

Question 2

Add the vehicle data to your analysis of the credit and loyalty card data. How does your assessment of the anomalies in question 1 change based on this new data? What discrepancies between vehicle, credit, and loyalty card data do you find? Please limit your answer to 8 images and 500 words.

Answer

Based on employee locations, we can work out the time range of the credit card transactions at Bean There Done That, Brewed Awakenings, and Jack’s Magical Beans, although some transactions at Jack’s Magical Beans have no match in the location data.
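One way to estimate such a time range is to take the earliest arrival and latest departure of any vehicle stop recorded at the shop. The sketch below assumes the vehicle data has already been reduced to stop records with a known location label; the records and the helper name `visit_window` are hypothetical.

```python
from datetime import datetime

# Hypothetical stop records: (location, arrival, departure).
stops = [
    ("Brewed Awakenings", datetime(2014, 1, 6, 7, 30), datetime(2014, 1, 6, 7, 45)),
    ("Brewed Awakenings", datetime(2014, 1, 6, 7, 50), datetime(2014, 1, 6, 8, 5)),
    ("Bean There Done That", datetime(2014, 1, 6, 7, 40), datetime(2014, 1, 6, 7, 55)),
]

def visit_window(stops, location):
    """Return (earliest arrival, latest departure) at a location, or None."""
    visits = [(a, d) for loc, a, d in stops if loc == location]
    if not visits:
        return None  # no vehicle stop recorded at this shop
    return min(a for a, _ in visits), max(d for _, d in visits)

window = visit_window(stops, "Brewed Awakenings")
missing = visit_window(stops, "Jack's Magical Beans")
print(window, missing)
```

A `None` result corresponds to the case noted above, where a transaction has no matching stop in the location data.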

Data Discrepancies

  1. A few credit card and loyalty card transactions cannot be traced in the vehicle data. It is possible that the employees did not use a company vehicle to go to the shop or restaurant.

  2. Some shops are not shown on the map below, for example Abila Zacharo, Hippokampos, and Kalami Kafenion, which made it challenging to locate them. Matching the vehicle data with the credit card data helps to locate some of these places, but some shops still cannot be located; one example is Daily Dealz.

Reading layer `Abila' from data source 
  `C:\jovinkahartanto\assignment_distill\MC2\Geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 3290 features and 9 fields
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: 24.82401 ymin: 36.04502 xmax: 24.90997 ymax: 36.09492
Geodetic CRS:  WGS 84

Some locations that can be traced using the credit card data are Abila Zacharo, Hippokampos, Kalami Kafenion, and a few other shops.

  3. As mentioned previously, some transactions do not match between the credit card and loyalty card data.

Question 3

Can you infer the owners of each credit card and loyalty card? What is your evidence? Where are there uncertainties in your method? Where are there uncertainties in the data? Please limit your answer to 8 images and 500 words.

Answer

Yes, we can infer the owners of some credit cards and loyalty cards by cross-checking the credit card data against the vehicle data on the map, and then joining the credit card transactions with the loyalty card transactions.

Step-by-step example of inferring the owner of a credit card and loyalty card:

  1. To find the owner of the credit card with last4ccnum 2540, first list the transactions made with that card.
  2. Compare these transactions with the vehicle data. The data frame stop_fin was created to list the vehicles that stopped for more than 5 minutes in one place. To find the card owner, look for a location that is rarely visited by employees, in this case Chostus Hotel. Only 3 employees visited Chostus Hotel within the 2 weeks of data provided, so we can check whose stop dates and times fit the credit card transactions. If there is more than one possibility, check another location and compare the dates and times of the credit card transactions with the GPS data.
  3. Repeat the steps above for the other credit cards.
  4. Join the completed credit card data (with the owner’s id or name) with the loyalty data to infer the owner of each loyalty card.
  5. Look up the first name and last name for each id to get the full name of the credit card and loyalty card owner.
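The core of the steps above can be sketched as two small functions: one that derives stops from GPS pings, and one that lists the cars stopped at the time of a purchase. This is an illustration only, not the code behind stop_fin. It assumes a stop is a gap of more than 5 minutes between consecutive pings of the same car, leaves out the location check for brevity, and uses invented pings and helper names.

```python
from datetime import datetime, timedelta

def find_stops(pings, min_gap=timedelta(minutes=5)):
    """pings: (car_id, timestamp) tuples sorted by car, then time.
    Returns (car_id, stop_start, stop_end) for gaps longer than min_gap."""
    stops = []
    for (car_a, t_a), (car_b, t_b) in zip(pings, pings[1:]):
        if car_a == car_b and t_b - t_a > min_gap:
            stops.append((car_a, t_a, t_b))
    return stops

def candidate_owners(stops, purchase_time):
    """Car ids that were stopped when the purchase was made."""
    return sorted({car for car, start, end in stops if start <= purchase_time <= end})

# Hypothetical pings for two cars on one day.
day = datetime(2014, 1, 8)
pings = [
    (7, day.replace(hour=12, minute=0)),
    (7, day.replace(hour=12, minute=1)),
    (7, day.replace(hour=12, minute=40)),
    (33, day.replace(hour=12, minute=0)),
    (33, day.replace(hour=12, minute=3)),
    (33, day.replace(hour=12, minute=4)),
]

stops = find_stops(pings)
owners = candidate_owners(stops, day.replace(hour=12, minute=20))
print(owners)
```

When several cars remain as candidates, repeating the check on a second, rarely visited location narrows the list, as described in step 2.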

With this method, not every credit card and loyalty card owner can be identified. Some scenarios make the transactions hard to trace:

  1. Some people may go to the shop in another vehicle (not a company vehicle), and for some coffee or breakfast places, some employees may walk there because it is very near their house

  2. An employee went to the shop in a company vehicle, but someone else paid the bill

  3. Not all employees paid by credit card; some may have used their loyalty card but paid in cash

Question 4

Given the data sources provided, identify potential informal or unofficial relationships among GASTech personnel. Provide evidence for these relationships. Please limit your response to 8 images and 500 words.

Answer

  1. Four security employees: Inga Ferro, Loreto Bodrogi, Hennie Osvaldo, and Minke Mies. They are all security staff from two different groups, site control and perimeter control, yet they repeatedly visited the same 5 unknown places.

They usually visited the meeting points at lunch time, around 11:30 AM to 12:30 PM. There are also a few cases where two or three of them visited the same place at the same time.

  1. Loreto and Minke visited Meeting Point 1 on 8 January 2014. Loreto stayed for 11 minutes and Minke for 37 minutes
  2. Hennie and Inga visited Meeting Point 4 at the same time on 10 January 2014, from 11:28 AM to 12:12 PM and from 11:26 AM to 12:16 PM respectively
  3. Inga, Loreto, and Hennie visited Meeting Point 6 during the same time period on 15 January 2014
  4. Hennie and Minke visited Meeting Point 5 with overlapping time periods on 16 January 2014
  5. Inga and Loreto visited Meeting Point 1 together on 17 January 2014
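Whether two visits coincide can be checked by computing the overlap of their time intervals. The sketch below uses the 10 January 2014 times listed above for Hennie and Inga at Meeting Point 4; the function name `overlap_minutes` is an illustrative choice.

```python
from datetime import datetime

def overlap_minutes(a_start, a_end, b_start, b_end):
    """Minutes during which both intervals are active (0 if they do not overlap)."""
    latest_start = max(a_start, b_start)
    earliest_end = min(a_end, b_end)
    minutes = (earliest_end - latest_start).total_seconds() / 60
    return max(0.0, minutes)

# Visit times from the 10 January 2014 entry in the list above.
hennie = (datetime(2014, 1, 10, 11, 28), datetime(2014, 1, 10, 12, 12))
inga = (datetime(2014, 1, 10, 11, 26), datetime(2014, 1, 10, 12, 16))

shared = overlap_minutes(*hennie, *inga)
print(shared)  # 44.0
```

Here Hennie's visit lies entirely inside Inga's, so the shared time equals the length of Hennie's visit.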

The date, time, and duration of these 4 employees’ visits to the 5 unknown places can be seen in the interactive graph below.

  2. Brand Tempestad and Elsa Orilla

As the graph below shows, Brand and Elsa always came to Chostus Hotel at lunch time. They arrived at similar times, which suggests a possible informal relationship between them.

Question 5

Do you see evidence of suspicious activity? Identify 1-10 locations where you believe the suspicious activity is occurring, and why. Please limit your response to 10 images and 500 words.

Answer

  1. Meeting Points 1, 2, 4, 5, and 6: As mentioned in Question 4, there are 5 unknown places that were repeatedly visited by 4 employees.

  2. GASTech Technologies: The data table below shows that car id 1, which belongs to Nils Calixto, visited GASTech at night, later than 8 PM. On some days Nils arrived at GASTech after 23:00, which is very suspicious. It is also unusual for Truck 104 to come to GASTech at 8 PM.

  3. House of Lars Azada: An unusual pattern also occurred on 10 January 2014 at Lars Azada’s house. Several employees, most of them from the IT and Engineering departments, visited his house between 18:00 and 1:00 AM the next day. All IT employees were there except Sven Flecha.

Nils Calixto also visited Lars’ house on 7 January 2014 at 3 AM, which is very suspicious.

  4. Chostus Hotel

Three employees visited Chostus Hotel: Sten Sanjorge Jr. (id:31), Brand Tempestad (id:33), and Elsa Orilla (id:7). However, as explained in Question 4, only Brand and Elsa visited the hotel together, at lunch time on 8, 10, 14, and 17 January 2014.

The remaining visits belong to Sten Sanjorge Jr., who stayed in the hotel from 17 January 2014 to 19 January 2014. The stop durations tracked by GPS show that Sten’s car was stopped for almost 19 hours on 17 January and 23 hours on 18 January 2014.
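The findings in this answer rest on two simple signals: arrivals at unusual hours and unusually long stops. A rough sketch of both checks follows. The night window (20:00 to 05:00), the 12-hour threshold, and the car ids are all assumptions made for illustration and are not taken from the data.

```python
from datetime import datetime

def is_late_night(arrival, start_hour=20, end_hour=5):
    """True if the arrival falls in the assumed night window (wraps past midnight)."""
    return arrival.hour >= start_hour or arrival.hour < end_hour

def stop_hours(arrive, depart):
    """Length of a stop in hours."""
    return (depart - arrive).total_seconds() / 3600

# Hypothetical stop records: (car_id, arrival, departure).
stops = [
    (101, datetime(2014, 1, 7, 23, 10), datetime(2014, 1, 7, 23, 50)),
    (102, datetime(2014, 1, 7, 12, 0), datetime(2014, 1, 7, 13, 0)),
    (103, datetime(2014, 1, 17, 9, 0), datetime(2014, 1, 18, 4, 0)),
]

flagged = [
    car
    for car, arrive, depart in stops
    if is_late_night(arrive) or stop_hours(arrive, depart) >= 12
]
print(flagged)
```

Flagged stops are only candidates for review; an overnight stop at a home address, for example, would need to be excluded before drawing conclusions.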